Smart Paper Notes

More Recent Advances in (Hyper)Graph Partitioning

Ümit V. Çatalyürek , Karen D. Devine , Marcelo Fonseca Faraj , Lars Gottesbüren , Tobias Heuer , Henning Meyerhenke , Peter Sanders , Sebastian Schlag , Christian Schulz , Daniel Seemaier

Category: Machine Learning

2022-05-26

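Recent years have seen significant advances in the design and evaluation of balanced (hyper)graph partitioning algorithms. We survey trends of the last decade in practical algorithms for balanced (hyper)graph partitioning, together with future research directions. Our work serves as an update to a previous survey on the topic. In particular, the survey extends the previous survey by also covering hypergraph partitioning and streaming algorithms, with an additional focus on parallel algorithms.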
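The abstract does not restate the problem definition, so the following is only a minimal sketch of the usual formulation of balanced graph partitioning: assign vertices to k blocks so that few edges cross blocks, while no block exceeds (1 + epsilon) * ceil(n / k) vertices. The function names, the unit vertex weights, and the example graph are illustrative assumptions, not taken from the survey.

```python
import math


def edge_cut(edges, block_of):
    """Count edges whose two endpoints lie in different blocks."""
    return sum(1 for u, v in edges if block_of[u] != block_of[v])


def is_balanced(block_of, k, epsilon):
    """Check that every block holds at most (1 + epsilon) * ceil(n / k) vertices.

    Assumes unit vertex weights (an assumption made for this sketch).
    """
    limit = (1 + epsilon) * math.ceil(len(block_of) / k)
    sizes = [0] * k
    for block in block_of.values():
        sizes[block] += 1
    return all(size <= limit for size in sizes)


# Hypothetical example: a 4-cycle 0-1-2-3-0 split into two blocks.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
partition = {0: 0, 1: 0, 2: 1, 3: 1}

cut = edge_cut(edges, partition)
balanced = is_balanced(partition, k=2, epsilon=0.03)
```

For hypergraphs the same balance check applies, but the objective has to account for hyperedges that span more than two vertices.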

translated by Google Translate

A Physics-Informed Neural Network to Model Port Channels

Marlon S. Mathias , Marcel R. de Barros , Jefferson F. Coelho , Lucas P. de Freitas , Felipe M. Moreno , Caio F. D. Netto , Fabio G. Cozman , Anna H. R. Costa , Eduardo A. Tannuri , Edson S. Gomi

Category: Machine Learning

2022-12-20

We describe a Physics-Informed Neural Network (PINN) that simulates the flow induced by the astronomical tide in a synthetic port channel, with dimensions based on the Santos - São Vicente - Bertioga Estuarine System. PINN models aim to combine the knowledge of physical systems and data-driven machine learning models. This is done by training a neural network to minimize the residuals of the governing equations in sample points. In this work, our flow is governed by the Navier-Stokes equations with some approximations. There are two main novelties in this paper. First, we design our model to assume that the flow is periodic in time, which is not feasible in conventional simulation methods. Second, we evaluate the benefit of resampling the function evaluation points during training, which has a near zero computational cost and has been verified to improve the final model, especially for small batch sizes. Finally, we discuss some limitations of the approximations used in the Navier-Stokes equations regarding the modeling of turbulence and how it interacts with PINNs.
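The abstract does not say how time periodicity is built into the model or how the evaluation points are resampled. The sketch below shows one common way to do both, given here as an assumption rather than as the authors' method: feed the network sine and cosine features of time so that any output is periodic by construction, and draw a fresh random batch of evaluation points at each training step. The function names, the channel length, and the 12-hour period are hypothetical.

```python
import math
import random


def periodic_time_features(t, period):
    """Map time t to (sin, cos) features that repeat every `period`.

    A network that only sees these features cannot produce a
    non-periodic output in t.
    """
    phase = 2.0 * math.pi * t / period
    return (math.sin(phase), math.cos(phase))


def resample_points(n, x_max, period, rng):
    """Draw n fresh (x, t) evaluation points uniformly from the domain."""
    return [(rng.uniform(0.0, x_max), rng.uniform(0.0, period)) for _ in range(n)]


rng = random.Random(0)
period = 12.0  # hypothetical tidal period, in hours
x_max = 100.0  # hypothetical channel length

# A new batch per training step; the equation residuals would be evaluated here.
batch = resample_points(8, x_max, period, rng)
features = [periodic_time_features(t, period) for _, t in batch]
```

Resampling costs only a handful of random draws per step, which is in line with the near-zero overhead the abstract reports.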

Audiovisual Masked Autoencoders

Mariana-Iuliana Georgescu , Eduardo Fonseca , Radu Tudor Ionescu , Mario Lucic , Cordelia Schmid , Anurag Arnab

Category: Computer Vision

2022-12-09

Can we leverage the audiovisual information already present in video to improve self-supervised representation learning? To answer this question, we study various pretraining architectures and objectives within the masked autoencoding framework, motivated by the success of similar methods in natural language and image understanding. We show that we can achieve significant improvements on audiovisual downstream classification tasks, surpassing the state-of-the-art on VGGSound and AudioSet. Furthermore, we can leverage our audiovisual pretraining scheme for multiple unimodal downstream tasks using a single audiovisual pretrained model. We additionally demonstrate the transferability of our representations, achieving state-of-the-art audiovisual results on Epic Kitchens without pretraining specifically for this dataset.

GET-DIPP: Graph-Embedded Transformer for Differentiable Integrated Prediction and Planning

Jiawei Sun , Chengran Yuan , Shuo Sun , Zhiyang Liu , Terence Goh , Anthony Wong , Keng Peng Tee , Marcelo H. Ang Jr

Category: Robotics

2022-11-11

Accurately predicting interactive road agents' future trajectories and planning a socially compliant and human-like trajectory accordingly are important for autonomous vehicles. In this paper, we propose a planning-centric prediction neural network, which takes surrounding agents' historical states and map context information as input, and outputs the joint multi-modal prediction trajectories for surrounding agents, as well as a sequence of control commands for the ego vehicle by imitation learning. An agent-agent interaction module along the time axis is proposed in our network architecture to better comprehend the relationship among all the other intelligent agents on the road. To incorporate the map's topological information, a Dynamic Graph Convolutional Neural Network (DGCNN) is employed to process the road network topology. Besides, the whole architecture can serve as a backbone for the Differentiable Integrated motion Prediction with Planning (DIPP) method by providing accurate prediction results and initial planning commands. Experiments are conducted on real-world datasets to demonstrate the improvements made by our proposed method in both planning and prediction accuracy compared to the previous state-of-the-art methods.

Determining Accessible Sidewalk Width by Extracting Obstacle Information from Point Clouds

Cláudia Fonseca Pinhão , Chris Eijgenstein , Iva Gornishka , Shayla Jansen , Diederik M. Roijers , Daan Bloembergen

Category: Computer Vision

2022-11-08

Obstacles on the sidewalk often block the path, limiting passage and resulting in frustration and wasted time, especially for citizens and visitors who use assistive devices (wheelchairs, walkers, strollers, canes, etc.). To enable equal participation and use of the city, all citizens should be able to perform and complete their daily activities with a similar amount of time and effort. Therefore, we aim to offer accessibility information regarding sidewalks, so that citizens can better plan their routes, and to help city officials identify the location of bottlenecks and act on them. In this paper we propose a novel pipeline to estimate obstacle-free sidewalk widths based on 3D point cloud data of the city of Amsterdam, as the first step to offer a more complete set of information regarding sidewalk accessibility.

Statistical Learning and Inverse Problems: A Stochastic Gradient Approach

Yuri R. Fonseca , Yuri F. Saporito

Category: (Statistics) Machine Learning | Machine Learning

2022-09-29

Inverse problems are paramount in Science and Engineering. In this paper, we consider the setup of Statistical Inverse Problem (SIP) and demonstrate how Stochastic Gradient Descent (SGD) algorithms can be used in the linear SIP setting. We provide consistency and finite sample bounds for the excess risk. We also propose a modification for the SGD algorithm where we leverage machine learning methods to smooth the stochastic gradients and improve empirical performance. We exemplify the algorithm in a setting of great interest nowadays: the Functional Linear Regression model. In this case we consider a synthetic data example and examples with a real data classification problem.

Face Super-Resolution Using Stochastic Differential Equations

Marcelo dos Santos , Rayson Laroca , Rafael O. Ribeiro , João Neves , Hugo Proença , David Menotti

Category: Computer Vision

2022-09-24

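Diffusion models have proven effective for various applications such as image, audio, and graph generation. Other important applications are image super-resolution and the solution of inverse problems. More recently, some works have used stochastic differential equations (SDEs) to generalize diffusion models to continuous time. In this work, we introduce SDEs to generate super-resolution face images. To the best of our knowledge, this is the first time SDEs have been used for such an application. The proposed method provides improved peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and consistency compared with existing super-resolution methods based on diffusion models. In particular, we also assess the potential application of this method to the face recognition task. A generic facial feature extractor is used to compare the super-resolution images with the ground truth, and superior results were obtained compared with other methods. Our code is publicly available at https://github.com/marcelowds/sr-sde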
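For readers unfamiliar with the PSNR metric named in the abstract, here is a minimal sketch of its standard definition, 10 * log10(MAX^2 / MSE), on flat lists of pixel values. It is not taken from the linked repository. The function name and the 8-bit default for `max_value` are assumptions made for illustration.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel "images": the estimate is off by 5 at every pixel.
ground_truth = [10.0, 50.0, 120.0, 200.0]
super_resolved = [15.0, 45.0, 125.0, 195.0]
score = psnr(ground_truth, super_resolved)
```

Higher values mean the super-resolved image is closer to the ground truth, and identical images give an infinite score.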

Visual Localization and Mapping in Dynamic and Changing Environments

João Carlos Virgolino Soares , Vivian Suzano Medeiros , Gabriel Fischer Abati , Marcelo Becker , Glauco Caurin , Marcelo Gattass , Marco Antonio Meggiolaro

Category: Robotics

2022-09-21

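The real-world deployment of fully autonomous mobile robots depends on a robust SLAM (Simultaneous Localization and Mapping) system capable of handling dynamic environments, where objects move in front of the robot, and changing environments, where objects are moved or replaced after the robot has already mapped the scene. This paper presents Changing-SLAM, a method for robust visual SLAM in both dynamic and changing environments. This is achieved by using a Bayesian filter combined with a long-term data association algorithm. In addition, it employs an efficient algorithm for dynamic keypoint filtering based on object detection, which correctly identifies the features inside a bounding box that are not dynamic and so prevents a depletion of features that could cause tracking loss. Furthermore, a new dataset containing RGB-D data, called the PUC-USP dataset, was developed specifically for evaluating changing environments at the object level. Six sequences were created using a mobile robot, an RGB-D camera, and a motion capture system. The sequences were designed to capture different scenarios that could lead to tracking failure or map corruption. To the best of our knowledge, Changing-SLAM is the first visual SLAM system that is robust to both dynamic and changing environments, does not assume a given camera pose or a known map, and is also able to run in real time. The proposed method was evaluated on benchmark datasets and compared with other state-of-the-art methods, proving to be highly accurate.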

Evaluating Temporal Patterns in Applied Infant Affect Recognition

Allen Chang , Lauren Klein , Marcelo R. Rosales , Weiyang Deng , Beth A. Smith , Maja J. Matarić

Category: Artificial Intelligence

2022-09-07

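Agents must continuously monitor their partners' affective states to understand and engage in social interactions. However, methods for evaluating affect recognition do not account for changes in classification performance that may occur during occlusions or transitions between affective states. This paper addresses temporal patterns in affect classification performance in the context of infant-robot interaction, where infants' affective states contribute to their ability to participate in a therapeutic leg movement activity. To support robustness to facial occlusions in video recordings, we trained infant affect recognition classifiers using both facial and body features. Next, we conducted an in-depth analysis of our best-performing models to evaluate how performance changed over time as the models encountered missing data and changing infant affect. During time windows when features were extracted with high confidence, a unimodal model trained on facial features achieved the same optimal performance as multimodal models trained on both facial and body features. However, multimodal models outperformed unimodal models when evaluated on the entire dataset. Additionally, model performance was weakest when predicting an affective state transition and improved after multiple predictions of the same affective state. These findings emphasize the benefits of incorporating body features in continuous affect recognition for infants. Our work highlights the importance of evaluating variability in model performance over time and in the presence of missing data.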

A First Look at Dataset Bias in License Plate Recognition

Rayson Laroca , Marcelo Santos , Valter Estevam , Eduardo Luz , David Menotti

Category: Computer Vision

2022-08-23

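Public datasets have played a key role in advancing the state of the art in License Plate Recognition (LPR). Although dataset bias has been recognized as a severe problem in the computer vision community, it has been largely overlooked in the LPR literature. LPR models are usually trained and evaluated separately on each dataset. In this scenario, they have often proven robust on the dataset they were trained on but show limited performance on unseen ones. Therefore, this work investigates the dataset bias problem in the LPR context. We performed experiments on eight datasets, four collected in Brazil and four in mainland China, and observed that each dataset has a unique, identifiable "signature", since a lightweight classification model predicts the source dataset of a license plate (LP) image with 95% accuracy. In our discussion, we draw attention to the fact that most LPR models are probably exploiting such signatures to improve the results achieved on each dataset at the cost of losing generalization capability. These results emphasize the importance of evaluating LPR models in cross-dataset setups, as they provide a better indication of generalization (hence real-world performance) than within-dataset ones.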
